Comparison of Unsupervised Anomaly Detection Techniques

Authors

  • Markus Goldstein
  • Andreas Dengel
Abstract

Anomaly detection is the process of finding outlying records in a given data set. The problem has grown in importance as data sets have grown in size and as the need has increased to efficiently extract outlying records that could indicate unauthorized access to a system, credit card theft, or the diagnosis of a disease. This bachelor thesis has two aims. The first is to implement a RapidMiner extension that contains the most applicable unsupervised anomaly detection algorithms, so that non-experts can easily apply them. The second is to evaluate the implemented algorithms in order to show their relative strengths and weaknesses. Two new algorithms are introduced. The first is a global variant of the cluster-based local outlier factor (CBLOF) that tries to overcome its shortcomings; the second is a local density-based algorithm called the local density cluster-based outlier factor. The performance of the implemented and the proposed algorithms was evaluated on real-world data sets from the UCI machine learning repository. The proposed algorithms showed promising results, outperforming CBLOF.

Similar resources

Learning Intrusion Detection: Supervised or Unsupervised?

Application and development of specialized machine learning techniques is gaining increasing attention in the intrusion detection community. A variety of learning techniques proposed for different intrusion detection problems can be roughly classified into two broad categories: supervised (classification) and unsupervised (anomaly detection and clustering). In this contribution we develop an ex...

Sub-Space Clustering, Inter-Clustering Results Association & Anomaly Correlation for Unsupervised Network Anomaly Detection

Network anomaly detection is a critical aspect of network management, for instance for QoS and security. The continuous emergence of new anomalies and attacks creates a constant challenge to cope with events that put network integrity at risk. Most network anomaly detection systems proposed so far employ a supervised strategy to accomplish the task, using either signature-based detection me...

Sub-Space Clustering and Evidence Accumulation for Unsupervised Network Anomaly Detection

Network anomaly detection has been a hot research topic for many years. Most detection systems proposed so far employ a supervised strategy to accomplish the task, using either signature-based detection methods or supervised-learning techniques. However, both approaches present major limitations: the former fails to detect unknown anomalies, the latter requires training and labeled traffic, whi...

Intrusion Detection Using Graph Support: A Hybrid Approach of Supervised and Unsupervised Techniques

At present it is almost impossible to detect zero-day attacks with the help of supervised anomaly detection methods. Unsupervised techniques can detect zero-day attacks but have the drawback of a low detection rate. Using a combination of unsupervised and supervised methods, promising detection results can be produced. In this paper we present a new sequence-based graph support techn...

Machine Learning for Host-based Anomaly Detection

By Gaurav Tandon. Dissertation advisor: Philip K. Chan, Ph.D. Anomaly detection techniques complement signature-based methods for intrusion detection. Machine learning approaches are applied to anomaly detection for automated learning and detection. Traditional host-based anomaly detectors model system call sequences to detect novel attacks. This...

Moving dispersion method for statistical anomaly detection in intrusion detection systems

A unified method for statistical anomaly detection in intrusion detection systems is theoretically introduced. It is based on estimating a dispersion measure of numerical or symbolic data on successive moving windows in time and finding the times when a relative change of the dispersion measure is significant. Appropriate dispersion measures, relative differences, moving windows, as well as tec...

Publication date: 2011